Thomson Legal and Regulatory at NTCIR-4: Monolingual and Pivot-Language Retrieval Experiments

Author

  • Isabelle Moulinier
Abstract

Thomson Legal and Regulatory participated in the CLIR task of the NTCIR-4 workshop. We submitted formal runs for monolingual retrieval in Japanese, Chinese and Korean. Our bilingual runs from Chinese and Korean to Japanese rely on English as a pivot language. In our monolingual experiments, we compared stopword lists built from query logs with stopword lists built from collection statistics and then edited manually. We investigated decompounding for Korean, more precisely partial credit for compound parts. Finally, we incorporated pseudo-relevance feedback in our Japanese runs. Our bilingual approach was an experiment in constructing a system within a short timeframe using publicly available resources. The low quality of retrieval suggests that such an approach is not viable in a real environment.


Similar resources

Thomson Legal and Regulatory at NTCIR-5: Japanese and Korean Experiments

Thomson Legal and Regulatory participated in the CLIR task of the NTCIR-5 workshop. We submitted formal runs for monolingual retrieval in Japanese and Korean, as well as for bilingual English-to-Japanese retrieval. We employed enhanced tokenization for our Japanese and Korean runs and applied a novel selective pseudo-relevance feedback scheme for Japanese. Our bilingual search participation was...


Thomson Legal and Regulatory at NTCIR-3: Japanese, Chinese and English Retrieval Experiments

Thomson Legal and Regulatory participated in the CLIR task of the NTCIR-3 workshop. We submitted formal runs for monolingual retrieval in Japanese and Chinese, and for bilingual retrieval from English to Japanese. Our main focus was in Japanese retrieval. We compared word-based and character-based indexing, as well as query formulation using characters and character bigrams. Our results show th...


NTCIR-4 Chinese, English, Korean Cross Language Retrieval Experiments Using PIRCS

In NTCIR-4 we participated in Korean, Chinese, English monolingual, Chinese-English, English-Korean bilingual, and Chinese-Korean cross language (using English as pivot) retrieval tasks based on our PIRCS retrieval system. The query translation approach was employed for CLIR. We combined two MT translations for Chinese-English, and two for English-Korean. For the latter, a web-based entity-orient...


Report on Thomson Legal and Regulatory Experiments at CLEF-2004

Thomson Legal and Regulatory participated in the CLEF-2004 monolingual and bilingual tracks. Monolingual experiments included Portuguese, Russian and Finnish. We investigated a new query structure to handle Finnish compounds. Our main focus was bilingual search from German to French. Our approach used query translation and post-translation pseudo-relevance feedback. We compared two translation ...


NTCIR-5 Chinese, English, Korean Cross Language Retrieval Experiments using PIRCS

In NTCIR-5 our focus is to see if web-assisted query expansion is useful, and to test an English-Korean bilingual dictionary. We participated in Chinese, Japanese, Korean and English monolingual retrieval, also using web expansion for Chinese and English. We also performed Chinese-English, English-Chinese, English-Korean bilingual, and Chinese-Korean pivot bilingual CLIR. The query translation ap...



Publication date: 2004